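Modern deep neural networks have achieved remarkable progress in medical image segmentation tasks. However, it has recently been observed that they tend to produce overconfident estimates, even in situations of high uncertainty, leading to poorly calibrated and unreliable models. In this work we introduce Maximum Entropy on Erroneous Predictions (MEEP), a training strategy for segmentation networks that selectively penalizes overconfident predictions, focusing only on misclassified pixels. In particular, we design a regularization term that encourages high-entropy predictions on misclassified pixels, increasing the network's uncertainty in complex scenarios. Our method is agnostic to the neural architecture, does not increase model complexity, and can be coupled with multiple segmentation loss functions. We benchmark the proposed strategy on two challenging medical image segmentation tasks: white matter hyperintensity lesions in brain magnetic resonance images (MRI), and atrial segmentation in cardiac MRI. The experimental results show that coupling MEEP with standard segmentation losses improves not only model calibration but also segmentation quality.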
translated by 谷歌翻译
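As a rough illustration of the idea in the MEEP abstract above, the sketch below adds an entropy bonus that is computed only on misclassified pixels to a standard cross-entropy loss. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `meep_style_loss`, the weight `lam`, and the choice of plain cross-entropy as the base loss are all hypothetical, and the paper's exact regularizer may differ.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs, eps=1e-12):
    """Shannon entropy of one pixel's predicted class distribution."""
    return -sum(p * math.log(p + eps) for p in probs)

def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true class for one pixel."""
    return -math.log(probs[label] + eps)

def meep_style_loss(pixel_logits, labels, lam=0.3):
    """Mean cross-entropy over all pixels, minus lam times the mean
    entropy over misclassified pixels only (hypothetical formulation).

    Subtracting the entropy rewards uncertain predictions where the
    network is wrong and leaves correctly classified pixels untouched.
    """
    probs = [softmax(z) for z in pixel_logits]
    ce = sum(cross_entropy(p, y) for p, y in zip(probs, labels)) / len(labels)
    # A pixel is misclassified when its arg-max class differs from the label.
    wrong = [p for p, y in zip(probs, labels)
             if max(range(len(p)), key=p.__getitem__) != y]
    if not wrong:
        return ce
    mean_entropy = sum(entropy(p) for p in wrong) / len(wrong)
    return ce - lam * mean_entropy
```

With `lam=0` the function reduces to plain cross-entropy, which makes it easy to compare against an unregularized baseline on the same batch.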
The correct functioning of photovoltaic (PV) cells is critical to ensuring the optimal performance of a solar plant. Anomaly detection techniques for PV cells can result in significant cost savings in operation and maintenance (O&M). Recent research has focused on deep learning techniques for automatically detecting anomalies in Electroluminescence (EL) images. Automated anomaly annotations can improve current O&M methodologies and help develop decision-making systems to extend the life-cycle of the PV cells and predict failures. This paper addresses the lack of anomaly segmentation annotations in the literature by proposing a combination of state-of-the-art data-driven techniques to create a Golden Standard benchmark. The proposed method stands out for (1) its adaptability to new PV cell types, (2) its cost-efficient fine-tuning, and (3) its use of public datasets to generate advanced annotations. The methodology has been validated by annotating a widely used dataset, reducing the annotation cost by 60%.
While skin cancer classification has been a popular and valuable deep learning application for years, there has been little consideration of the context in which testing images are taken. Traditional melanoma classifiers rely on the assumption that their testing environments are analogous to the structured images on which they are trained. This paper combats this notion, arguing that mole size, a vital attribute in professional dermatology, is a red herring in automated melanoma detection. Although malignant melanomas are consistently larger than benign melanomas, this distinction proves unreliable and harmful when images cannot be contextually scaled. This implementation builds a custom model that eliminates size as a training feature to prevent overfitting to incorrect parameters. Additionally, random rotation and contrast augmentations are performed to simulate the real-world use of melanoma detection applications. Several custom models with varying forms of data augmentation are implemented to demonstrate the most significant features of the generalization abilities of mole classifiers. These implementations show that user unpredictability is crucial when utilizing such applications. The caution required when manually modifying data is acknowledged, as data loss and biased conclusions are necessary considerations in this process. Additionally, mole size inconsistency and its significance are discussed in both the dermatology and deep learning communities.
The integration of multiple sensor modalities and deep learning into Simultaneous Localization And Mapping (SLAM) systems is an area of significant interest in current research. Multi-modality is a stepping stone towards achieving robustness in challenging environments and interoperability of heterogeneous multi-robot systems with varying sensor setups. With maplab 2.0, we provide a versatile open-source platform that facilitates developing, testing, and integrating new modules and features into a fully-fledged SLAM system. Through extensive experiments, we show that maplab 2.0's accuracy is comparable to the state-of-the-art on the HILTI 2021 benchmark. Additionally, we showcase the flexibility of our system with three use cases: i) large-scale (approx. 10 km) multi-robot multi-session (23 missions) mapping, ii) integration of non-visual landmarks, and iii) incorporation of a semantic object-based loop closure module into the mapping framework. The code is available open-source at https://github.com/ethz-asl/maplab.
Domain shift is a well-known problem in the medical imaging community. In particular, for endoscopic image analysis, where the data can have different modalities, the performance of deep learning (DL) methods is adversely affected. In other words, methods developed on one modality cannot be used for a different modality. However, in real clinical settings, endoscopists switch between modalities for better mucosal visualisation. In this paper, we explore the domain generalisation technique to enable DL methods to be used in such scenarios. To this end, we propose to use superpixels generated with Simple Linear Iterative Clustering (SLIC), in a method we refer to as "SUPRA" (SUPeRpixel Augmented method). SUPRA first generates a preliminary segmentation mask making use of our new loss, "SLICLoss", which encourages a segmentation that is both accurate and color-consistent. We demonstrate that SLICLoss, when combined with Binary Cross Entropy (BCE) loss, can improve the model's generalisability on data that presents significant domain shift. We validate this novel compound loss on a vanilla U-Net using the EndoUDA dataset, which contains images of Barrett's Esophagus and polyps from two modalities. We show that our method yields an improvement of nearly 25% on the target domain set compared to the baseline.
To apply federated learning to drug discovery, we developed a novel platform in the context of the European Innovative Medicines Initiative (IMI) project MELLODDY (grant n° 831472), which comprised 10 pharmaceutical companies, academic research labs, large industrial companies and startups. The MELLODDY platform was the first industry-scale platform to enable the creation of a global federated model for drug discovery without sharing the confidential data sets of the individual partners. The federated model was trained on the platform by aggregating the gradients of all contributing partners in a cryptographically secure way following each training iteration. The platform was deployed on an Amazon Web Services (AWS) multi-account architecture running Kubernetes clusters in private subnets. Organisationally, the roles of the different partners were codified as different rights and permissions on the platform and administered in a decentralized way. The MELLODDY platform generated new scientific discoveries, which are described in a companion paper.
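The gradient aggregation step described above can be pictured with a toy example. The sketch below is not the MELLODDY protocol, whose cryptographic details the abstract does not give; it swaps in a simple pairwise additive masking scheme, a common textbook idea for secure aggregation, purely to show how an aggregator can average updates without seeing any partner's raw gradient. All function names and the learning rate are hypothetical.

```python
import random

def masked_updates(gradients, seed=0):
    """Add pairwise random masks that cancel when all updates are summed.

    Illustrative assumption: each pair of partners (i, j) shares a random
    mask; partner i adds it and partner j subtracts it, so the masks
    vanish in the sum but hide each individual gradient.
    """
    rng = random.Random(seed)
    n, dim = len(gradients), len(gradients[0])
    masked = [list(g) for g in gradients]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for k in range(dim):
                masked[i][k] += mask[k]
                masked[j][k] -= mask[k]
    return masked

def aggregate(updates):
    """Average the (possibly masked) updates coordinate by coordinate."""
    n = len(updates)
    return [sum(u[k] for u in updates) / n for k in range(len(updates[0]))]

def sgd_step(weights, avg_grad, lr=0.1):
    """Apply one gradient-descent step to the shared model weights."""
    return [w - lr * g for w, g in zip(weights, avg_grad)]

# Three hypothetical partners, each holding a private 2-dimensional gradient.
partner_grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
avg = aggregate(masked_updates(partner_grads))
new_weights = sgd_step([0.0, 0.0], avg)
```

The average of the masked updates equals the average of the raw gradients up to floating-point error, which is the property that lets training proceed as if the data had been pooled.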
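This work is on training a generative action/video recognition model whose output is a free-form, action-specific caption describing the video (rather than an action class label). A generative approach has practical advantages, such as producing more fine-grained and human-readable output and being naturally open-world. To this end, we propose to adapt a pre-trained generative Vision & Language (V&L) foundation model for video/action recognition. While there have recently been a few attempts to adapt V&L models trained with contrastive learning (e.g. CLIP) for video/action recognition, to the best of our knowledge we propose the first method that sets out to accomplish this goal for a generative model. We first show that directly fine-tuning a generative model to produce action classes suffers from severe overfitting. To alleviate this, we introduce REST, a training framework consisting of two key components: (a) an unsupervised method for adapting the generative model to action/video by means of pseudo-caption generation and self-training, i.e. without using any action-specific labels; and (b) a CLIP-based retrieval approach for discovering a diverse set of pseudo-captions for each video to train the model. Importantly, we show that both components are necessary to obtain high accuracy. We evaluate REST on the problem of zero-shot action recognition, where we show that our approach is very competitive compared with contrastive learning-based methods. Code will be made available.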
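The development of new AUV technology has increased the range of tasks that AUVs can tackle and the length of their operations. As a result, AUVs are capable of handling highly complex operations. However, these missions do not fit easily into the traditional method of defining a mission as a series of pre-planned waypoints, because it is not possible to know in advance everything that might occur during the mission. This results in a gap between the operator's expectations and actual operational performance. Consequently, this can diminish the level of trust between operators and AUVs, resulting in unnecessary mission interruptions. To bridge this gap between in-mission robotic behaviour and operators' expectations, this work aims to provide a framework that explains the decisions and actions taken by an autonomous vehicle during the mission in an easy-to-understand manner. Additionally, the objective is to have an autonomy-agnostic system that can be added as an additional layer on top of any autonomy architecture. To make the approach applicable across different autonomous systems equipped with different autonomies, this work uses knowledge distillation to decouple the inner workings of the autonomy from the decision points and the resulting executed actions. Finally, to present the explanations to operators in a more natural way, the output of the distilled decision tree is combined with natural language explanations and reported to the operators as sentences. For this reason, an additional step known as Concept2Text generation is added at the end of the explanation pipeline.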
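The last step of the pipeline above, turning the output of a distilled decision tree into a sentence, can be sketched as follows. This is only an illustration of the general idea under stated assumptions: the tree layout, the feature names (`battery_percent`, `obstacle_distance_m`), the thresholds, and the sentence template are all hypothetical and are not taken from the paper.

```python
def explain(tree, observation):
    """Walk a (hypothetical) distilled decision tree and render the
    decision path as one natural-language sentence."""
    reasons = []
    node = tree
    # Internal nodes test one feature against a threshold; leaves hold an action.
    while "action" not in node:
        feature, threshold = node["feature"], node["threshold"]
        value = observation[feature]
        if value <= threshold:
            reasons.append(f"{feature} ({value}) was at most {threshold}")
            node = node["low"]
        else:
            reasons.append(f"{feature} ({value}) was above {threshold}")
            node = node["high"]
    return (f"The vehicle chose to {node['action']} because "
            + " and ".join(reasons) + ".")

# Hypothetical tree distilled from an autonomy's observed decisions.
tree = {
    "feature": "battery_percent", "threshold": 20,
    "low": {"action": "surface"},
    "high": {
        "feature": "obstacle_distance_m", "threshold": 5,
        "low": {"action": "replan the route"},
        "high": {"action": "continue to the next waypoint"},
    },
}

sentence = explain(tree, {"battery_percent": 80, "obstacle_distance_m": 3})
```

Because the sentence is built only from the tree's decision path, the same rendering step works regardless of which autonomy the tree was distilled from, which mirrors the autonomy-agnostic goal stated in the abstract.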
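The abundant spatial and angular information in light fields has enabled the development of multiple disparity estimation approaches. However, acquiring light fields entails high storage and processing costs, limiting the use of this technology in practical applications. To overcome these drawbacks, compressive sensing (CS) theory has enabled the development of optical architectures that acquire a single coded light field measurement. This measurement is decoded using an optimization algorithm or a deep neural network, both of which require high computational cost. The traditional approach to disparity estimation from compressed light fields first recovers the entire light field and then applies a post-processing step, and therefore takes a long time. In contrast, this work proposes fast disparity estimation from a single compressed measurement by omitting the recovery step required by traditional approaches. Specifically, we propose to jointly optimize an optical architecture for acquiring a single coded light field snapshot and a convolutional neural network (CNN) for estimating the disparity maps. Experimentally, the proposed method estimates disparity maps comparable to those obtained from light fields reconstructed using deep learning approaches. Furthermore, the proposed method is 20 times faster in training and inference than the best method that estimates disparity from reconstructed light fields.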
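Many applications require a robot to operate in environments shared with other agents, such as humans or other robots. However, such shared scenes are typically subject to different kinds of long-term semantic scene changes. The ability to model and predict such changes is therefore crucial for robot autonomy. In this work, we formalize the task of semantic scene variability estimation and identify three main varieties of semantic scene change: changes in the position of an object, in its semantic state, or in the composition of the scene as a whole. To represent this variability, we propose the Variable Scene Graph (VSG), which augments existing 3D Scene Graph (SG) representations with a variability attribute representing the likelihood of discrete long-term change events. We present a novel method, DeltaVSG, to estimate the variability of VSGs in a supervised fashion. We evaluate our method on the 3RScan long-term dataset, showing notable improvements on this novel task over existing approaches. DeltaVSG achieves a precision of 72.2% and a recall of 66.8%, often mimicking human intuition about how indoor scenes change over time. We further show the utility of VSG prediction in the task of active robotic change detection, speeding up task completion by 62.4% compared with a scene-change-unaware planner. We make our code available as open-source.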